Skip to content

feat(context): make per-section char limits configurable and proportional#4

Merged
nehraa merged 3 commits into
mainfrom
feat/configurable-context-limits
May 25, 2026
Merged

feat(context): make per-section char limits configurable and proportional#4
nehraa merged 3 commits into
mainfrom
feat/configurable-context-limits

Conversation

@nehraa

@nehraa nehraa commented May 18, 2026

Copy link
Copy Markdown
Owner

Summary

  • Add primaryDocLimit, primaryFileLimit, relatedDocLimit, relatedFileLimit to RetrievalConfig (all optional — null = proportional default)
  • Add env var support: CODERAG_PRIMARY_DOC_LIMIT, CODERAG_PRIMARY_FILE_LIMIT, CODERAG_RELATED_DOC_LIMIT, CODERAG_RELATED_FILE_LIMIT
  • Replace hardcoded const section limits in prompt.ts with a deriveSectionLimits() function that scales proportionally with maxContextChars
  • fitRelatedContexts() now keeps a 200-char snippet instead of dropping to empty when budget runs out
  • buildContextPackage() and buildMultiHopContextPackage() now return { context, limits } so prompt formatting gets the actual configured limits

Problem

Previously PRIMARY_DOC_CHAR_LIMIT = 1_200, PRIMARY_FILE_LIMIT = 4_000, RELATED_DOC_LIMIT = 320, RELATED_FILE_LIMIT = 1_200 were hardcoded compile-time constants in prompt.ts. They:

  1. Ignored whatever maxContextChars was configured
  2. Were unreachable for models with 200K+ context windows
  3. Couldn't be tuned without recompiling

Additionally, when the context budget exhausted, fitRelatedContexts() dropped file content to "" entirely — losing all signal for the LLM.

Solution

deriveSectionLimits() computes defaults proportional to maxContextChars:

  • primaryDoc → 7.5% of maxContextChars (e.g., 1,200 @ 16K, 12,000 @ 160K)
  • primaryFile → 25% of maxContextChars
  • relatedDoc → 2% of maxContextChars
  • relatedFile → 7.5% of maxContextChars

Explicit overrides (CODERAG_PRIMARY_DOC_LIMIT etc.) always take precedence. For 200K context models, set CODERAG_MAX_CONTEXT_CHARS=160000 and tune explicit overrides to taste.

Test plan

  • All 214 tests pass
  • context-builder.test.ts updated for new buildContextPackage return shape
  • prompt.test.ts updated to pass SectionLimits to buildMessages

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Configurable per-section context limits (primary/related doc and file excerpt sizes) and exposure of embedding provider token limits.
  • Improvements

    • Better context preservation: when budget is exhausted, small file snippets are kept instead of dropping content.
    • Prompt builders now accept explicit size limits for consistent truncation.
  • Changes

    • Default Gemini embedding model updated to models/gemini-embedding-2.
  • Documentation

    • README and env example updated to reflect the new default model.
  • Tests

    • Tests updated to exercise explicit limits and the new default embedding model.

Review Change Stack

…onal

Previously, hardcoded constants in prompt.ts (primaryDoc=1200, primaryFile=4000,
relatedDoc=320, relatedFile=1200) overrode any budget configuration and were
unreachable for models with 200K+ context windows. Additionally, when the context
budget exhausted, related file content was dropped entirely instead of preserving
a snippet.

Changes:
- Add primaryDocLimit, primaryFileLimit, relatedDocLimit, relatedFileLimit to
  RetrievalConfig (all optional, null = use proportional default)
- Add CODERAG_PRIMARY_DOC_LIMIT, CODERAG_PRIMARY_FILE_LIMIT,
  CODERAG_RELATED_DOC_LIMIT, CODERAG_RELATED_FILE_LIMIT env var support
- Replace hardcoded SectionLimits constants with a deriveSectionLimits() function
  that scales defaults proportionally with maxContextChars
- fitRelatedContexts() now keeps a 200-char snippet instead of dropping to empty
  when budget runs out
- buildContextPackage and buildMultiHopContextPackage now return { context, limits }
  so prompt.ts gets the actual configured limits

For 200K context models, set CODERAG_MAX_CONTEXT_CHARS=160000 and explicit overrides
to taste. For small-context models, the warning message now reflects the actual
configured limits rather than compile-time constants.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 18, 2026 15:32
@qodo-code-review

Copy link
Copy Markdown

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

@coderabbitai

coderabbitai Bot commented May 18, 2026

Copy link
Copy Markdown

Warning

Review limit reached

@Copilot, we couldn't start this review because you've used your available PR reviews for now.

Your plan includes 1 review of capacity. Refill in 22 minutes and 21 seconds.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more review capacity refills, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than trial, open-source, and free plans. In all cases, review capacity refills continuously over time.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 71375b86-be33-415a-ae0c-dd58b12ca206

📥 Commits

Reviewing files that changed from the base of the PR and between a352ae2 and 42e2dd1.

📒 Files selected for processing (4)
  • src/llm/context-builder.ts
  • src/llm/multi-hop-context-builder.ts
  • src/test/context-builder.test.ts
  • src/test/prompt.test.ts
📝 Walkthrough

Walkthrough

This PR parameterizes prompt/truncation sizes via a new SectionLimits contract derived from retrieval config, returns computed limits from context builders, threads limits through message builders and service code, and changes exhausted related-content handling to retain short snippets with updated warnings.

Changes

Parameterized Context Limits and Snippet Retention

Layer / File(s) Summary
Type contracts and configuration schema
src/llm/prompt.ts, src/types.ts, src/llm/context-builder.ts
SectionLimits interface defines per-section character limits; retrieval schema adds optional primaryDocLimit, primaryFileLimit, relatedDocLimit, relatedFileLimit; imports adjusted.
Limit computation from retrieval config
src/llm/context-builder.ts, src/llm/multi-hop-context-builder.ts
deriveSectionLimits() computes per-section limits from maxContextChars with proportional defaults and respects explicit overrides; context builders now return { context, limits }; multi-hop retrieval typing narrowed for used fields.
Budget exhaustion and snippet retention
src/llm/context-builder.ts, src/llm/multi-hop-context-builder.ts
When related-context budget is exhausted, builders keep a bounded snippet (≤200 chars) instead of dropping file content and emit distinct warnings for snippet-kept vs dropped content.
Message formatting with parameterized limits
src/llm/prompt.ts
Formatting functions accept limits and use them to size primary/related doc/file excerpts; formatWarnings encapsulates warning truncation limits; buildMessages() and buildMultiHopMessages() now require limits.
Service integration of limits
src/service/coderag.ts
CodeRag.query() now destructures { context, limits } from single- and multi-hop context builders and passes limits into message builders for LLM requests.
Environment variable configuration for limits
src/service/config.ts
loadSerializableConfig() parses new env overrides CODERAG_PRIMARY_DOC_LIMIT, CODERAG_PRIMARY_FILE_LIMIT, CODERAG_RELATED_DOC_LIMIT, CODERAG_RELATED_FILE_LIMIT; embedding Gemini default updated to models/gemini-embedding-2.
Test coverage for limits and snippets
src/test/context-builder.test.ts, src/test/prompt.test.ts, src/test/gemini-embedder.test.ts
Tests updated to destructure context from new builder return shapes, verify snippet retention behavior and warning text on exhaustion, pass explicit testLimits to buildMessages(), and update Gemini embedder expectations to models/gemini-embedding-2.
Embedding provider limits and defaults
src/indexer/documents.ts, src/indexer/*-embedder.ts, .env.example, README.md, src/cli/setup-wizard.ts
Add CHARS_PER_TOKEN heuristic and use embeddingProvider.maxInputTokens * CHARS_PER_TOKEN for truncation; add maxInputTokens on providers (LocalHash, Gemini, Onnx), update Gemini default model to models/gemini-embedding-2, and reflect these in docs/tests/.env/setup.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • nehraa/CodeRag#1: Embedding provider additions/changes that overlap with provider defaults and token-limit handling.
  • nehraa/CodeRag#3: Prior multi-hop retrieval changes related to context assembly and budgeting.

Poem

🐰 I nibble limits in tidy rows,

snippets saved where the cold wind blows,
configs whisper per-section art,
builders hand limits to every part,
tiny excerpts kept — a thoughtful start.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat(context): make per-section char limits configurable and proportional' accurately summarizes the primary change: adding configurable per-section character limits with proportional defaults.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/configurable-context-limits

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces dynamic context budgeting by allowing per-section character limits to be derived from the total context budget or explicitly overridden via configuration. It also improves the handling of exhausted context budgets by retaining a small snippet of file content instead of dropping it entirely. Feedback focuses on code maintainability, specifically recommending the reuse of the MultiHopRetrievalResult type and the deriveSectionLimits function to avoid duplication, as well as extracting the hardcoded snippet length into a shared constant.

Comment on lines 4 to 12
ContextPackage,
GraphSnapshot,
IndexedNodeDocument,
MultiHopRetrievalResult,
RetrievedNodeContext,
RetrievalConfig
} from "../types.js";
import type { RetrievedNodeContext } from "../types.js";
import type { SectionLimits } from "./prompt.js";
import { FileCache } from "../store/file-cache.js";
import { createRetrievedNodeContext } from "../retrieval/page-index.js";

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The MultiHopRetrievalResult type was removed from the imports but is still logically applicable to the retrieval results used in this file. Additionally, deriveSectionLimits is now duplicated from context-builder.ts and should be imported instead.

  ContextPackage,
  GraphSnapshot,
  IndexedNodeDocument,
  RetrievedNodeContext,
  RetrievalConfig,
  MultiHopRetrievalResult
} from "../types.js";
import type { SectionLimits } from "./prompt.js";
import { deriveSectionLimits } from "./context-builder.js";
import { FileCache } from "../store/file-cache.js";
import { createRetrievedNodeContext } from "../retrieval/page-index.js";

Comment thread src/llm/multi-hop-context-builder.ts Outdated
const buildMultiHopGraphSummary = (
subQuestions: string[],
retrievalResult: MultiHopRetrievalResult,
retrievalResult: { deduplicatedNodes: BlueprintNode[]; retrievalMetadata: Array<{ subQuestion: string; primaryNode?: BlueprintNode; relatedNodes: BlueprintNode[]; filesReferenced: string[] }> },

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using a complex inline type for retrievalResult reduces readability and maintainability. Since MultiHopRetrievalResult is already defined in src/types.ts and matches this structure, it should be used here.

Suggested change
retrievalResult: { deduplicatedNodes: BlueprintNode[]; retrievalMetadata: Array<{ subQuestion: string; primaryNode?: BlueprintNode; relatedNodes: BlueprintNode[]; filesReferenced: string[] }> },
retrievalResult: MultiHopRetrievalResult,

Comment thread src/llm/multi-hop-context-builder.ts Outdated
question: string,
subQuestions: string[],
retrievalResult: MultiHopRetrievalResult,
retrievalResult: { deduplicatedNodes: BlueprintNode[]; primaryNodes: Array<BlueprintNode | undefined>; expandedNodes: BlueprintNode[]; retrievalMetadata: Array<{ subQuestion: string; primaryNode?: BlueprintNode; relatedNodes: BlueprintNode[]; filesReferenced: string[] }> },

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using a complex inline type for retrievalResult reduces readability and maintainability. Since MultiHopRetrievalResult is already defined in src/types.ts and matches this structure, it should be used here.

Suggested change
retrievalResult: { deduplicatedNodes: BlueprintNode[]; primaryNodes: Array<BlueprintNode | undefined>; expandedNodes: BlueprintNode[]; retrievalMetadata: Array<{ subQuestion: string; primaryNode?: BlueprintNode; relatedNodes: BlueprintNode[]; filesReferenced: string[] }> },
retrievalResult: MultiHopRetrievalResult,

Comment thread src/llm/multi-hop-context-builder.ts Outdated
Comment on lines +75 to +83
const deriveSectionLimits = (retrieval: RetrievalConfig): SectionLimits => {
const mcc = retrieval.maxContextChars;
return {
primaryDoc: retrieval.primaryDocLimit ?? Math.round((mcc / 16000) * 1200),
primaryFile: retrieval.primaryFileLimit ?? Math.round((mcc / 16000) * 4000),
relatedDoc: retrieval.relatedDocLimit ?? Math.round((mcc / 16000) * 320),
relatedFile: retrieval.relatedFileLimit ?? Math.round((mcc / 16000) * 1200)
};
};

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The deriveSectionLimits function is duplicated from src/llm/context-builder.ts. To ensure consistency and ease of maintenance, this function should be exported from a common location (or just from context-builder.ts) and imported here.

Comment thread src/llm/context-builder.ts Outdated
if (budget <= 0) {
warnings.push(`Dropped file content for ${context.filePath} because the context budget was exhausted.`);
// Never drop entirely — truncate to at least a snippet so the LLM retains context.
const snippetLength = Math.min(200, context.fullFileContent.length);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The magic number 200 for the snippet length is hardcoded in both context-builder.ts and multi-hop-context-builder.ts. This should be defined as a shared constant (e.g., MIN_SNIPPET_CHARS) to improve readability and make it easier to tune the snippet size globally.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 45e1b99785

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines 114 to 126
if (budget <= 0) {
warnings.push(`Dropped file content for ${context.filePath} because the context budget was exhausted.`);
// Never drop entirely — truncate to at least a snippet so the LLM retains context.
const snippetLength = Math.min(200, context.fullFileContent.length);
const snippet = context.fullFileContent.slice(0, snippetLength);
warnings.push(
snippetLength > 0
? `File content exhausted for ${context.filePath}; kept first ${snippetLength} chars as snippet.`
: `Dropped file content for ${context.filePath} because the context budget was exhausted.`
);
fittedContexts.push({
...context,
fullFileContent: ""
fullFileContent: snippet
});

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep exhausted related snippets within the budget

When a primary file consumes the remaining context budget (or maxContextChars is very small), this branch now adds up to 200 characters for every related node anyway. For queries with several related files, the returned context and the prompt built from it can exceed the configured retrieval.maxContextChars, whereas the previous behavior dropped content once the budget was exhausted. If snippets are desired, they need to be accounted for in the remaining budget or limited by an explicit separate cap.

Useful? React with 👍 / 👎.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/test/context-builder.test.ts (1)

92-103: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add one direct assertion for the new limits return contract.

These tests only destructure context, so regressions in buildContextPackage(...).limits would not be caught even though limits are now part of retrieval behavior. Add at least one explicit limits assertion in this suite.

As per coding guidelines, New indexing, retrieval, transport, or MCP behavior must include direct test coverage.

Also applies to: 151-162, 213-224

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/test/context-builder.test.ts` around lines 92 - 103, The test calls
buildContextPackage(...) but only destructures context, so add an explicit
assertion for the new limits return field to prevent regressions; locate the
test invocation of buildContextPackage in this file (the call that currently
destructures { context }) and update it to also capture limits (e.g., const {
context, limits } = await buildContextPackage(...)) then add one assertion
checking expected limits shape/values (for example assert.strictEqual or expect
toMatchObject against the known limits like { topK: 4, rerankK: 2,
maxContextChars: 18 } or whichever canonical values this test should enforce);
apply the same pattern to the other two test blocks referenced (around lines
151-162 and 213-224) to ensure each suite directly covers the limits contract.
🧹 Nitpick comments (2)
src/llm/multi-hop-context-builder.ts (1)

75-83: 💤 Low value

Consider importing deriveSectionLimits from context-builder.ts instead of duplicating it.

The function exists in both files with identical logic. Since context-builder.ts already exports it (lines 48–62), the multi-hop context builder should import it rather than maintain a separate copy. This eliminates duplication and reduces maintenance burden.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/llm/multi-hop-context-builder.ts` around lines 75 - 83, Remove the
duplicated deriveSectionLimits implementation from multi-hop-context-builder and
import the shared deriveSectionLimits exported by context-builder instead;
update the top of multi-hop-context-builder.ts to import { deriveSectionLimits }
(and any needed types like RetrievalConfig/SectionLimits if not already
imported), delete the local function block, and ensure all references in
buildMultiHopContext (or whichever functions call deriveSectionLimits) now use
the imported symbol so there is a single source of truth.
src/test/prompt.test.ts (1)

51-56: ⚡ Quick win

Use deliberately small/distinct limits in one case to verify limit application behavior.

With near-default values, these tests mainly validate the new function signature, not that buildMessages actually applies caller-provided limits. Add one case with tiny limits and assert truncation/excerpt length.

As per coding guidelines, New indexing, retrieval, transport, or MCP behavior must include direct test coverage.

Also applies to: 59-60, 80-81, 107-108

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/test/prompt.test.ts` around lines 51 - 56, Add a focused test case in
src/test/prompt.test.ts that uses a deliberately tiny and distinct limits object
(e.g., testLimits with very small primaryDoc/primaryFile/relatedDoc/relatedFile
values) and passes it into buildMessages to verify that caller-provided limits
are actually applied; assert that returned message parts (titles, excerpts, or
content slices) are truncated to the expected lengths (check exact
excerpt/truncation lengths) and add similar assertions for the other existing
cases flagged (the ones around the other testLimits uses) so the test suite
verifies truncation behavior rather than only the signature.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/llm/context-builder.ts`:
- Around line 52-55: The computed proportional defaults primaryDocDefault,
primaryFileDefault, relatedDocDefault and relatedFileDefault can become 0 for
small max-context values (mcc); change each calculation to clamp the result to
at least 1 (e.g., wrap the existing Math.round((mcc / 16000) * ...) expression
with Math.max(1, ...)) so no section limit becomes zero and truncation remains
valid.

---

Outside diff comments:
In `@src/test/context-builder.test.ts`:
- Around line 92-103: The test calls buildContextPackage(...) but only
destructures context, so add an explicit assertion for the new limits return
field to prevent regressions; locate the test invocation of buildContextPackage
in this file (the call that currently destructures { context }) and update it to
also capture limits (e.g., const { context, limits } = await
buildContextPackage(...)) then add one assertion checking expected limits
shape/values (for example assert.strictEqual or expect toMatchObject against the
known limits like { topK: 4, rerankK: 2, maxContextChars: 18 } or whichever
canonical values this test should enforce); apply the same pattern to the other
two test blocks referenced (around lines 151-162 and 213-224) to ensure each
suite directly covers the limits contract.

---

Nitpick comments:
In `@src/llm/multi-hop-context-builder.ts`:
- Around line 75-83: Remove the duplicated deriveSectionLimits implementation
from multi-hop-context-builder and import the shared deriveSectionLimits
exported by context-builder instead; update the top of
multi-hop-context-builder.ts to import { deriveSectionLimits } (and any needed
types like RetrievalConfig/SectionLimits if not already imported), delete the
local function block, and ensure all references in buildMultiHopContext (or
whichever functions call deriveSectionLimits) now use the imported symbol so
there is a single source of truth.

In `@src/test/prompt.test.ts`:
- Around line 51-56: Add a focused test case in src/test/prompt.test.ts that
uses a deliberately tiny and distinct limits object (e.g., testLimits with very
small primaryDoc/primaryFile/relatedDoc/relatedFile values) and passes it into
buildMessages to verify that caller-provided limits are actually applied; assert
that returned message parts (titles, excerpts, or content slices) are truncated
to the expected lengths (check exact excerpt/truncation lengths) and add similar
assertions for the other existing cases flagged (the ones around the other
testLimits uses) so the test suite verifies truncation behavior rather than only
the signature.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d1a3d15e-31df-4dc6-b9cf-08d1c8fdbe57

📥 Commits

Reviewing files that changed from the base of the PR and between ba3e43d and 45e1b99.

📒 Files selected for processing (8)
  • src/llm/context-builder.ts
  • src/llm/multi-hop-context-builder.ts
  • src/llm/prompt.ts
  • src/service/coderag.ts
  • src/service/config.ts
  • src/test/context-builder.test.ts
  • src/test/prompt.test.ts
  • src/types.ts

Comment thread src/llm/context-builder.ts Outdated

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR makes context section character limits configurable (via config and env vars) and proportional to maxContextChars, and threads the derived limits through context-building into prompt formatting. It also changes budget-exhaustion behavior so related file content retains a short snippet instead of being dropped entirely.

Changes:

  • Extend RetrievalConfig + config loading to support per-section char limit overrides (including env vars).
  • Replace hardcoded prompt section limits with passed-in SectionLimits, returned by context builders.
  • Adjust context-budget fitting to keep a short related-file snippet when budget is exhausted, and update tests for new return shapes.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
src/types.ts Adds optional per-section retrieval limit fields to the config schema/type.
src/service/config.ts Adds env var overrides for the new per-section limit fields.
src/llm/context-builder.ts Derives SectionLimits from retrieval config and returns { context, limits }; keeps snippets on budget exhaustion.
src/llm/multi-hop-context-builder.ts Returns { context, limits } for multi-hop; keeps snippets on budget exhaustion; includes local section-limit derivation.
src/llm/prompt.ts Introduces SectionLimits and applies them during prompt formatting; updates message builders to require limits.
src/service/coderag.ts Wires new { context, limits } return shape into prompt message construction.
src/test/context-builder.test.ts Updates for new return shape and new snippet-on-exhaustion behavior.
src/test/prompt.test.ts Updates prompt tests to pass explicit SectionLimits.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/types.ts
Comment on lines +25 to +29
/** Per-section char limits. All optional — defaults scale with maxContextChars. */
primaryDocLimit: z.number().int().positive().optional(),
primaryFileLimit: z.number().int().positive().optional(),
relatedDocLimit: z.number().int().positive().optional(),
relatedFileLimit: z.number().int().positive().optional()
Comment thread src/service/config.ts
Comment on lines 153 to 162
retrieval: {
...baseConfig.retrieval,
topK: parseNumber(process.env.CODERAG_TOP_K) ?? baseConfig.retrieval.topK,
rerankK: parseNumber(process.env.CODERAG_RERANK_K) ?? baseConfig.retrieval.rerankK,
maxContextChars: parseNumber(process.env.CODERAG_MAX_CONTEXT_CHARS) ?? baseConfig.retrieval.maxContextChars
maxContextChars: parseNumber(process.env.CODERAG_MAX_CONTEXT_CHARS) ?? baseConfig.retrieval.maxContextChars,
primaryDocLimit: parseNumber(process.env.CODERAG_PRIMARY_DOC_LIMIT) ?? baseConfig.retrieval.primaryDocLimit,
primaryFileLimit: parseNumber(process.env.CODERAG_PRIMARY_FILE_LIMIT) ?? baseConfig.retrieval.primaryFileLimit,
relatedDocLimit: parseNumber(process.env.CODERAG_RELATED_DOC_LIMIT) ?? baseConfig.retrieval.relatedDocLimit,
relatedFileLimit: parseNumber(process.env.CODERAG_RELATED_FILE_LIMIT) ?? baseConfig.retrieval.relatedFileLimit
},
Comment thread src/llm/context-builder.ts Outdated
Comment on lines +113 to +117
for (const context of relatedContexts) {
if (budget <= 0) {
warnings.push(`Dropped file content for ${context.filePath} because the context budget was exhausted.`);
// Never drop entirely — truncate to at least a snippet so the LLM retains context.
const snippetLength = Math.min(200, context.fullFileContent.length);
const snippet = context.fullFileContent.slice(0, snippetLength);
Comment thread src/llm/multi-hop-context-builder.ts Outdated
Comment on lines +134 to +139
const fittedRelated: RetrievedNodeContext[] = [];
for (const ctx of relatedContexts) {
if (remainingBudget <= 0) {
fittedRelated.push({ ...ctx, fullFileContent: "" });
const snippetLength = Math.min(200, ctx.fullFileContent.length);
const snippet = ctx.fullFileContent.slice(0, snippetLength);
warnings.push(
Comment on lines 14 to 17
const buildMultiHopGraphSummary = (
subQuestions: string[],
retrievalResult: MultiHopRetrievalResult,
retrievalResult: { deduplicatedNodes: BlueprintNode[]; retrievalMetadata: Array<{ subQuestion: string; primaryNode?: BlueprintNode; relatedNodes: BlueprintNode[]; filesReferenced: string[] }> },
snapshot: GraphSnapshot
Comment thread src/llm/multi-hop-context-builder.ts Outdated
Comment on lines +75 to +83
const deriveSectionLimits = (retrieval: RetrievalConfig): SectionLimits => {
const mcc = retrieval.maxContextChars;
return {
primaryDoc: retrieval.primaryDocLimit ?? Math.round((mcc / 16000) * 1200),
primaryFile: retrieval.primaryFileLimit ?? Math.round((mcc / 16000) * 4000),
relatedDoc: retrieval.relatedDocLimit ?? Math.round((mcc / 16000) * 320),
relatedFile: retrieval.relatedFileLimit ?? Math.round((mcc / 16000) * 1200)
};
};
Comment thread src/llm/prompt.ts
Comment on lines 137 to 142
const formatSubQuestionSection = (
index: number,
question: string,
nodes: RetrievedNodeContext[]
nodes: RetrievedNodeContext[],
limits: SectionLimits
): string => {
nehraa and others added 2 commits May 25, 2026 11:25
- Add maxInputTokens to EmbeddingProvider interface
- Gemini embedding-2: 8192 tokens (~32K chars) vs old hardcoded 2048
- ONNX all-MiniLM-L6-v2: 256 tokens (~1K chars)
- local-hash: Infinity (no truncation)
- Update default model from embedding-001 to embedding-2
- Tests updated to reflect new model name

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot finished work on behalf of nehraa May 25, 2026 06:00

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/test/gemini-embedder.test.ts (1)

1-229: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Missing test coverage for maxInputTokens field.

The new maxInputTokens field introduced in GeminiEmbeddingProvider (line 26 of gemini-embedder.ts) is not tested. This field affects indexing behavior (used for truncation in documents.ts), and the coding guideline requires direct test coverage for new indexing, retrieval, transport, or MCP behavior.

Please add a test case verifying:

  • The maxInputTokens field exists on the provider instance
  • It has the expected value (8192)

As per coding guidelines: "New indexing, retrieval, transport, or MCP behavior must include direct test coverage"

📋 Suggested test case

Add this test to verify the maxInputTokens field:

+  it("exposes maxInputTokens for gemini-embedding-2", () => {
+    const provider = new GeminiEmbeddingProvider({ apiKey: "test-key" });
+    expect(provider.maxInputTokens).toBe(8192);
+  });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/test/gemini-embedder.test.ts` around lines 1 - 229, Add a unit test in
the existing describe("GeminiEmbeddingProvider") suite that constructs a
GeminiEmbeddingProvider (e.g., new GeminiEmbeddingProvider() or with apiKey) and
asserts the instance has a maxInputTokens property equal to 8192; reference the
GeminiEmbeddingProvider constructor and the maxInputTokens field so the test
explicitly checks provider.maxInputTokens === 8192 and fails if the value is
missing or different.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/test/gemini-embedder.test.ts`:
- Around line 1-229: Add a unit test in the existing
describe("GeminiEmbeddingProvider") suite that constructs a
GeminiEmbeddingProvider (e.g., new GeminiEmbeddingProvider() or with apiKey) and
asserts the instance has a maxInputTokens property equal to 8192; reference the
GeminiEmbeddingProvider constructor and the maxInputTokens field so the test
explicitly checks provider.maxInputTokens === 8192 and fails if the value is
missing or different.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2b7aa920-e8c1-4cf3-838b-607272cdf56b

📥 Commits

Reviewing files that changed from the base of the PR and between 45e1b99 and a352ae2.

📒 Files selected for processing (10)
  • .env.example
  • README.md
  • src/cli/setup-wizard.ts
  • src/indexer/documents.ts
  • src/indexer/embedder.ts
  • src/indexer/gemini-embedder.ts
  • src/indexer/onnx-embedder.ts
  • src/service/config.ts
  • src/test/gemini-embedder.test.ts
  • src/types.ts
✅ Files skipped from review due to trivial changes (4)
  • src/indexer/embedder.ts
  • src/indexer/onnx-embedder.ts
  • .env.example
  • README.md

@nehraa

nehraa commented May 25, 2026

Copy link
Copy Markdown
Owner Author

@copilot Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In @src/test/gemini-embedder.test.ts:

  • Around line 1-229: Add a unit test in the existing
    describe("GeminiEmbeddingProvider") suite that constructs a
    GeminiEmbeddingProvider (e.g., new GeminiEmbeddingProvider() or with apiKey) and
    asserts the instance has a maxInputTokens property equal to 8192; reference the
    GeminiEmbeddingProvider constructor and the maxInputTokens field so the test
    explicitly checks provider.maxInputTokens === 8192 and fails if the value is
    missing or different.

@nehraa nehraa merged commit 0d36196 into main May 25, 2026
1 of 2 checks passed
Copilot stopped work on behalf of nehraa due to an error May 25, 2026 06:02
@nehraa nehraa deleted the feat/configurable-context-limits branch May 25, 2026 06:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants